TITLE OF THE INVENTION:

RECOMBINATION-FACILITATED MULTIPLEX ANALYSIS OF DNA FRAGMENTS

FIELD OF THE INVENTION:

The invention relates to the use of recombinases, and in
particular, the Cre protein of bacteriophage P1 and its
loxP DNA recombinational site, to facilitate the
restriction fragment analysis and sequencing of a DNA
molecule.

BACKGROUND OF THE INVENTION:

The techniques of molecular biology were developed to
analyze relatively small DNA molecules. Increasingly,
however, research has centered on the analysis of larger
and larger DNA molecules, such as the chromosomes of
mammals, and in particular, the chromosomes of the human
genome. The analysis of such large DNA molecules has
often been limited by the ease with which the initially
developed technology could be adapted to permit the
analysis of such extremely large molecules. Such methods
have exploited the ability of restriction endonucleases to
produce fragments of the DNA molecule which would then be
more amenable to sequence analysis.

I. THE MAPPING OF RESTRICTION ENDONUCLEASE
RECOGNITION SITES

One initial objective in the analysis of DNA
molecules is to produce a gross physical map of the DNA.
For small DNA molecules, this may be readily achieved
using restriction endonucleases to identify and orient the
corresponding recognition sites of such enzymes. Methods
for performing such "restriction mapping" are well-known
(see, for example, Perbal, B., A Practical Guide to
Molecular Cloning, John Wiley & Sons, NY (1984), pp. 208-
216; Maniatis, T., et al., In: Molecular Cloning, A
Laboratory Manual, Cold Spring Harbor Press, Cold Spring
Harbor, NY (1982), both herein incorporated by reference).
As will be apparent, the complexity of the data
obtained in "restriction mapping" a target molecule
increases rapidly as the size of the target molecule
increases. For this reason, it is usually necessary to
employ several strategies when attempting to obtain
detailed maps of a large DNA molecule.

One strategy which is employed involves
simultaneously digesting a target DNA molecule with
combinations of several restriction enzymes each of which
is expected to cleave at only a small number of sites in
the target. For linear target molecules, the number of
fragments equals the number of restriction enzyme sites
plus 1. For a circular DNA molecule, the number of
double-digestion fragments equals the number of fragments
generated by the first enzyme plus the number of fragments
generated by the second enzyme. Restriction maps are
created from the data by a process which is part logic,
and part trial and error (Lawn, R.M. et al., Cell 15:1157
(1978)).
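The fragment-count rules stated above can be checked with a short sketch. The function name `digest`, the molecule length, and the site positions are hypothetical values chosen only for illustration; they are not taken from the cited references.

```python
def digest(length, sites, circular=False):
    """Return fragment lengths from a complete digest.

    `sites` are cut positions measured from an arbitrary origin
    (strictly inside the molecule for a linear target).
    """
    cuts = sorted(set(sites))
    if not cuts:
        return [length]  # uncut molecule
    if circular:
        # On a circle each site yields one fragment; the last one wraps around.
        frags = [b - a for a, b in zip(cuts, cuts[1:])]
        frags.append(length - cuts[-1] + cuts[0])
        return frags
    bounds = [0] + cuts + [length]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical 6000 bp target: enzyme A cuts twice, enzyme B once.
enzyme_a = [1000, 4000]
enzyme_b = [2500]
linear = digest(6000, enzyme_a + enzyme_b)
circular = digest(6000, enzyme_a + enzyme_b, circular=True)
```

With these assumed positions, the linear double digest gives four fragments (three sites plus one), and the circular double digest gives three fragments (two from enzyme A alone plus one from enzyme B alone).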

The process of creating a restriction map may be
facilitated by a sequential analysis of fragments. In
this method, one treats a target molecule with a first
restriction endonuclease, isolates the digestion products,
and then subjects the purified products to digestion with
a second endonuclease. Such steps can be performed
rapidly and efficiently (Parker, R.C. et al., Meth.
Enzymol. 65:358 (1980)).
In lieu of obtaining a complete endonuclease
digestion of a target molecule, considerable information
can be obtained by incubating the target molecule with
endonucleases under conditions resulting in limited
digestion. Two approaches have been developed. In the
first, the aim is to compare the sizes of the partial- and
complete-digestion products with one another, and to
deduce which fragments might be adjacent to one another in
the target molecule. In general, the number of partial-
digestion products (F) of a linear DNA molecule that
contains N+1 restriction sites (where N>0) is given by the
formula:

$$F = \frac{N^{2} + 3N}{2}$$

For a molecule having 20 sites, 209 partial-digestion 
products are obtainable. Thus, for large DNA molecules, 
the method is of very limited utility since the number of 
possible partial-digestion products quickly becomes 
unmanageable. 
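The count quoted above can be reproduced by direct enumeration. The sketch below assumes that a partial-digestion product is any stretch between two boundaries of the molecule (its two ends plus its restriction sites), other than the intact molecule and the fragments of a complete digest, and it assumes the closed form F = (N^2 + 3N)/2 for a molecule with N+1 sites. Both function names are hypothetical.

```python
from itertools import combinations


def partial_products(n_sites):
    """Count partial-digestion products of a linear molecule by enumeration."""
    boundaries = range(n_sites + 2)  # left end, each site, right end
    count = 0
    for left, right in combinations(boundaries, 2):
        span = right - left
        if span == 1:
            continue  # fragment of the complete digest
        if span == n_sites + 1:
            continue  # the intact, uncut molecule
        count += 1
    return count


def closed_form(n_sites):
    """Assumed closed form; the text counts the sites as N + 1."""
    n = n_sites - 1
    return (n * n + 3 * n) // 2
```

Under these assumptions, `partial_products(20)` and `closed_form(20)` both give 209, which agrees with the figure quoted for a molecule having 20 sites.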

One means for simplifying such an analysis was
proposed by Smith, H.O. et al. (Nucleic Acids Res.
3:2387 (1976), herein incorporated by reference). This
method uses a target molecule which has been labeled with
32P at one of its termini. Digestion products are
visualized by autoradiography after electrophoresis in
agarose gels. Digestion products which are not linked to
the labelled termini are not detected by the analysis.
Thus, the number of labelled partial-digestion products is
equal to the number of restriction sites within the target
molecule. Moreover, the labelled fragments form a simple,
overlapping ladder, with a common labelled terminus. The
order of ascension of the fragments corresponds to the
order of restriction sites in the target molecule.
Lastly, partial-digestion products produced through the
action of several different enzymes can be analyzed
simultaneously on the same gel.
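The end-label ladder can be illustrated with a small sketch that enumerates every possible partial digest of a molecule labelled at one terminus and records only the fragment carrying the label. The function name, the molecule length, and the site positions are hypothetical.

```python
from itertools import combinations


def labelled_products(length, sites):
    """Lengths of label-bearing fragments over all partial digests.

    The label is assumed to sit at position 0, so in any digest the only
    detectable fragment runs from position 0 to the first site that was cut.
    """
    ordered = sorted(sites)
    seen = set()
    for k in range(len(ordered) + 1):
        for cut in combinations(ordered, k):
            seen.add(cut[0] if cut else length)  # no cut: intact molecule
    return sorted(seen)


sites = [700, 1900, 2600, 4100]  # hypothetical site positions
ladder = labelled_products(5000, sites)
```

Apart from the uncut molecule at the top of the ladder, each rung corresponds to exactly one restriction site, and the rungs ascend in the same order as the sites lie from the labelled end.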

One deficiency of the method is the difficulty which
is often encountered in specifically labelling a single
end of a target molecule. Indeed, due to the symmetrical
nature of a double-stranded DNA molecule, both ends of the
molecule are equally available for labelling. Thus, in
practice, considerable difficulty is encountered in
labelling only one end of a linear DNA molecule.

II. THE SEQUENCING OF DNA MOLECULES 

Initial attempts to determine the sequence of a DNA
molecule were extensions of techniques which had been
initially developed to permit the sequencing of RNA
molecules (Sanger, F., J. Mol. Biol. 13:373 (1965);
Brownlee, G.G. et al., J. Mol. Biol. 34:379 (1968)). Such
methods involved the specific cleavage of DNA into smaller
fragments by (1) enzymatic digestion (Robertson, H.D. et
al., Nature New Biol. 241:38 (1973); Ziff, E.B. et al.,
Nature New Biol. 241:34 (1973)); (2) nearest neighbor
analysis (Wu, R., et al., J. Mol. Biol. 57:491 (1971)),
and (3) the "Wandering Spot" method (Sanger, F., Proc.
Natl. Acad. Sci. (U.S.A.) 70:1209 (1973)).

More recent advances in DNA sequencing have led to
the development of two highly utilized methods for
elucidating the sequence of a DNA molecule: the "Dideoxy-
Mediated Chain Termination Method," also known as the
"Sanger Method" (Sanger, F., et al., J. Mol. Biol. 94:441
(1975)) and the "Maxam-Gilbert Chemical Degradation
Method" (Maxam, A.M., et al., Proc. Natl. Acad. Sci.
(U.S.A.) 74:560 (1977), both references herein
incorporated by reference).



WO 92/22650    PCT/US92/04923



A. DIDEOXY-MEDIATED CHAIN TERMINATION METHOD
OF DNA SEQUENCING

In the dideoxy-mediated or "Sanger" chain termination
method of DNA sequencing, the sequence of a DNA molecule
is obtained through the extension of an oligonucleotide
primer which is hybridized to the nucleic acid molecule
being sequenced. In brief, four separate primer extension
reactions are conducted. In each reaction, a DNA
polymerase is added along with the four nucleotide
triphosphates needed to polymerize DNA. Each of the
reactions is carried out in the additional presence of a
2',3'-dideoxy derivative of the A, T, C, or G nucleoside
triphosphates. Such derivatives differ from conventional
nucleotide triphosphates in that they lack a hydroxyl
residue at the 3' position of deoxyribose. Thus, although
they can be incorporated by a DNA polymerase into the
newly synthesized primer extension, the absence of the 3'
hydroxyl group causes them to be incapable of forming a
phosphodiester bond with a succeeding nucleotide
triphosphate. Thus, the incorporation of a dideoxy
derivative results in the termination of the extension
reaction. Since the dideoxy derivatives are present in
lower concentrations than their corresponding,
conventional nucleotide triphosphate analogs, the net
result of each of the four reactions is to produce a set
of nested oligonucleotides each of which is terminated by
the particular dideoxy derivative used in the reaction.
By subjecting the reaction products of each of the
extension reactions to electrophoresis, it is possible to
obtain a series of four "ladders." Since the position of
each "rung" of the ladder is determined by the size of the
molecule, and since such size is determined by the
incorporation of the dideoxy derivative, the appearance
and location of a particular "rung" can be readily
translated into the sequence of the extended primer.
Thus, through an electrophoretic analysis, the sequence of
the extended primer can be determined.
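The way four ladders translate into a sequence can be sketched as follows. This is a simplified model under stated assumptions: fragment lengths are counted from the end of the primer, termination is assumed to occur at every position of the relevant base in at least some molecules, and the function names and example strand are hypothetical.

```python
BASES = "ACGT"


def chain_termination(extended_strand):
    """Return {base: rung lengths} for the four dideoxy reactions.

    In the reaction containing the dideoxy derivative of `base`, extension
    can stop wherever that base is incorporated, so each occurrence of the
    base in the newly synthesized strand gives one fragment of that length.
    """
    return {
        base: [i + 1 for i, b in enumerate(extended_strand) if b == base]
        for base in BASES
    }


def read_ladders(ladders):
    """Read the sequence from the shortest rung upward across the four lanes."""
    rungs = sorted(
        (length, base) for base, lengths in ladders.items() for length in lengths
    )
    return "".join(base for _, base in rungs)


strand = "GATTACAGGC"  # hypothetical extended primer sequence
lanes = chain_termination(strand)
recovered = read_ladders(lanes)
```

Each fragment length appears in exactly one lane, so sorting the rungs by length and noting the lane of each recovers the strand that was synthesized.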

One deficiency of the dideoxy-mediated sequencing
method is the need to optimize the ratio of dideoxy
nucleoside triphosphates to conventional nucleoside
triphosphates in the chain-extension/chain-termination
reactions. Such adjustments are needed in order to
maximize the amount of information which can be obtained
from each primer. Additionally, the efficiency of dideoxy
nucleotide incorporation in a particular target molecule
is partially dependent upon the primary and secondary
structures of the target.

The dideoxy-mediated method thus requires single-
stranded templates, specific oligonucleotide primers, and
high quality preparations of a DNA polymerase (typically
the Klenow fragment of E. coli DNA polymerase I).
Initially, these requirements delayed the widespread use
of the method. However, with the ready availability of
synthetic primers, and the availability of bacteriophage
M13 and phagemid vectors (Maniatis, T., et al., Molecular
Cloning, a Laboratory Manual, 2nd Edition, Cold Spring
Harbor Press, Cold Spring Harbor, New York (1989), herein
incorporated by reference), the dideoxy-mediated chain
termination method is now extensively employed.

B. THE MAXAM-GILBERT METHOD OF DNA SEQUENCING

The Maxam-Gilbert method of DNA sequencing is a
degradative method. In this procedure, a fragment of DNA
is labeled at one end and partially cleaved in four
separate chemical reactions, each of which is specific for
cleaving the DNA molecule at a particular base (G or C) or
at a particular type of base (A/G, C/T, or A>C). As in the
above-described dideoxy method, the effect of such
reactions is to create a set of nested molecules whose
lengths are determined by the locations of a particular
base along the length of the DNA molecule being sequenced.
The nested reaction products are then resolved by electro-
phoresis, and the end-labeled molecules are detected,
typically by autoradiography when a 32P label is employed.
Four single lanes are typically required in order to
determine the sequence.
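Reading a sequence from four chemical-cleavage lanes can be sketched in the same spirit. The sketch assumes the commonly used set of G, A+G, C+T, and C reactions (one of the combinations the text allows), assumes that cleavage at a base removes that base from the labelled fragment, and ignores the practical difficulty of resolving the very shortest fragments. All names are hypothetical.

```python
def chemical_cleavage(seq):
    """Labelled-fragment lengths for four base-specific reactions.

    Cleavage at position i (1-based from the labelled end) destroys that
    base, leaving a labelled fragment of i - 1 nucleotides.
    """
    specificity = {"G": "G", "A+G": "AG", "C+T": "CT", "C": "C"}
    return {
        lane: {i for i, b in enumerate(seq) if b in bases}
        for lane, bases in specificity.items()
    }


def read_lanes(lanes, length):
    """Infer each base from the lanes in which a band of that length appears."""
    out = []
    for n in range(length):
        if n in lanes["G"]:
            out.append("G")  # band in G (and therefore also in A+G)
        elif n in lanes["A+G"]:
            out.append("A")  # band in A+G only
        elif n in lanes["C"]:
            out.append("C")  # band in C (and therefore also in C+T)
        elif n in lanes["C+T"]:
            out.append("T")  # band in C+T only
        else:
            out.append("N")  # no band observed
    return "".join(out)


seq = "GATTACAGGC"  # hypothetical labelled strand
decoded = read_lanes(chemical_cleavage(seq), len(seq))
```

A base is thus identified by the pattern of lanes showing a band at a given length, rather than by a single lane as in the dideoxy method.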

The Maxam-Gilbert method thus uses simple chemical
reagents which are readily available. Nevertheless, the
dideoxy-mediated method has several advantages over the
Maxam-Gilbert method. The Maxam-Gilbert method is
extremely laborious and requires meticulous experimental
technique. In contrast, the Sanger method may be employed
on larger nucleic acid molecules.

Significantly, in the Maxam-Gilbert method the 
sequence is obtained from the original DNA molecule, and 
not from an enzymatic copy. For this reason, the method 
can be used to sequence synthetic oligonucleotides, and to 
analyze DNA modifications such as methylation, etc. It 
can also be used to study both DNA secondary structure and 
protein-DNA interactions. Indeed, it has been readily 
employed in the identification of the binding sites of DNA 
binding proteins. 

Methods for sequencing DNA using either the dideoxy-
mediated method or the Maxam-Gilbert method are widely
known to those of ordinary skill in the art. Such methods
are, for example, disclosed in Maniatis, T., et al.,
Molecular Cloning, a Laboratory Manual, 2nd Edition, Cold
Spring Harbor Press, Cold Spring Harbor, New York (1989),
and in Zyskind, J.W., et al., Recombinant DNA Laboratory
Manual, Academic Press, Inc., New York (1988), both herein
incorporated by reference.

III. THE ANALYSIS OF LARGE DNA SEQUENCES

Both the above-described dideoxy-mediated method and
the Maxam-Gilbert method of DNA sequencing require the
prior isolation of the DNA molecule which is to be
sequenced. The sequence information is obtained by
subjecting the reaction products to electrophoretic
analysis (typically using polyacrylamide gels). Thus, a
sample is applied to a lane of a gel, and the various
species of nested fragments are separated from one another
by their migration velocity through the gel. The number
of nested fragments which can be separated in a single
lane is approximately 200-300 regardless of whether the
Sanger or the Maxam-Gilbert method is used. Those of
great skill in the art can separate up to 600 fragments in
a single lane. Thus, in order to sequence large DNA
molecules, it is necessary to fragment the molecule, and
to sequence the fragments in separate lanes of the
sequencing gel. The sequence of the entire molecule is
obtained by orienting and ordering the sequence data
obtained from each fragment.

Two approaches have been employed by those of skill
in this art to accomplish this goal. In a random or
shotgun sequencing approach, sequence data is collected by
subcloning fragments of the target DNA molecule. No
attempt is initially made to determine the linear
orientation or order of the subclones with respect to the
intact target DNA molecule. Instead, the accumulated data
are stored and ultimately arranged into order by a
computer (Staden, R., Nucleic Acids Res. 14:217 (1986);
Anderson, S. et al., Nature 290:457 (1981); Gingeras,
T.R., J. Biol. Chem. 257:13475 (1982); Sanger, F. et al.,
J. Mol. Biol. 162:729 (1982), and Baer, R. et al., Nature
310:207 (1984)). As will be appreciated, such random
shotgun approaches often result in the multiple sequencing
of the same oligonucleotide fragment, and thus are often
inefficient in terms of time and materials.

In contrast, directed approaches have been employed
in which sequences of the target DNA are obtained in a
systematic fashion. For example, the target DNA molecule
may be ordered by restriction mapping using the methods
described above, and the discrete restriction fragments
sequenced. Alternatively, the target molecule may be
sequenced by sequencing nested sets of deletions which
begin at one of its ends. The use of such nested
fragments progressively brings more and more remote
regions of the target DNA into range for sequencing.
Lastly, sequence information obtained from a particular
target molecule can be used to prepare a primer which can
then be used in a subsequent sequencing reaction in order
to obtain additional sequence information. As will be
perceived, a directed sequence analysis of a target DNA
molecule often requires substantial a priori information
regarding the sequence. Moreover, for large target
molecules (of sizes on the order of kilobases) such as
would be encountered in the sequencing of eukaryotic (and
in particular, mammalian) chromosomes, directional
sequencing is quite arduous.

Several strategies have been developed to facilitate
the sequence analysis of large (multi-kilobase) gene
sequences. In one strategy, a large DNA molecule is
fragmented through the use of restriction endonucleases
which cut at infrequent sites. Such action results in the
production of a small number of fragments each of which
contains a portion of the sequence present in the original
DNA molecule. Due to their smaller size, such fragments
are more amenable to sequence analysis (using the above-
stated methods) than the original DNA molecule. The
sequence of the entire molecule is obtained by orienting
the fragments with respect to each other in order to
produce a gross physical map of the target molecule
(Schwartz, D.C. et al., Cell 37:67-75 (1984); Southern,
E.M. et al., Nucleic Acids Res. 15:5925-5943 (1987);
Burke, D. et al., Science 236:808-812 (1987); Olson, M.V.
et al., Proc. Natl. Acad. Sci. (U.S.A.) 83:7826-7830
(1986)). Since this procedure reveals both the sequence
and the orientation of the fragments, it permits one to
readily determine the sequence of the entire DNA molecule.
Alternatively, a large DNA target may be subcloned
into a large number of randomly selected bacteriophage or
cosmid clones. Overlapping sequences in such clones are
identified by unique restriction enzyme "fingerprinting"
(Olson, M.V. et al., Proc. Natl. Acad. Sci. (U.S.A.)
83:7826-7830 (1986); Coulson, A. et al., Proc. Natl. Acad.
Sci. (U.S.A.) 83:7821-7825 (1986)). The information is
then used to assemble a map of overlapping sets of clones
(Staden, R., Nucleic Acids Res. 8:3673-3694 (1980)). This
method has been successful in generating complete or par-
tial maps of Saccharomyces cerevisiae chromosomes (Olson,
M.V. et al., Proc. Natl. Acad. Sci. (U.S.A.) 83:7826-7830
(1986)); C. elegans (Coulson, A. et al., Proc. Natl. Acad.
Sci. (U.S.A.) 83:7821-7825 (1986); Coulson, A. et al.,
Nature 335:184-186 (1988)); and the E. coli chromosome
(Kohara, Y. et al., Cell 50:495-508 (1987)).

Several factors may limit the use of conventional
methods in the analysis of the nucleotide sequence of a
target molecule. Typically, each lane of a sequencing gel
can resolve only about 300 different fragments. Thus, in
order to determine the nucleotide sequence of a large DNA
molecule, multiple sequencing gels are often needed.
This, in turn, limits the amount of new sequence
information which can be readily obtained per day. For a
large nucleic acid molecule, a substantial number of
technically demanding and time consuming steps must be
performed. In particular, since the above-described
techniques are capable of analyzing only one set of nested
oligonucleotides per sample, the sequencing of large DNA
molecules requires the use of multiple sequencing gels
each having a large number of lanes. The electrophoretic
analysis step in the sequencing process thus comprises a
significant limitation to the amount of sequence
information which can be obtained and the rate with which
it can be processed.

Similarly, the use of conventional methods in the
analysis of restriction endonuclease-induced fragments of
a target molecule is often not straightforward.

In some cases, sequence and restriction fragment
analysis is limited by the low copy number of each target
DNA sequence in natural genomes. One method for
overcoming this limitation is through the use of
amplification techniques, such as the polymerase chain
reaction, "PCR" (Mullis, K. et al., Cold Spring Harbor
Symp. Quant. Biol. 51:263-273 (1986); Erlich, H. et al., EP
50,424; EP 84,796, EP 258,017, EP 237,362; Mullis, K., EP
201,184; Mullis, K. et al., US 4,683,202; Erlich, H., US
4,582,788; and Saiki, R. et al., US 4,683,194, which
references are incorporated herein by reference). The
technique cannot, however, readily be applied to amplify
every target molecule present in a large gene sequence. A
method for detecting and/or measuring PCR amplification is
disclosed by Brenner, S. et al., in International Patent
Application Publication No. WO/11375. This method entails
linking a loxP sequence to a target molecule during PCR
amplification. Amplification is detected by incubating
the reacted molecules with Cre.

IV. MULTIPLEX ANALYSIS

A substantial improvement in DNA sequencing
technology was recently developed, and designated
"multiplex DNA sequencing" (Church, G.M., et al., Science
240:185-188 (1988); Church, G.M. et al., U.S. Patent
4,942,124; both herein incorporated by reference).
Multiplex DNA sequencing utilizes DNA libraries which are
individually constructed in 20 different plasmid vectors.
In addition to standard drug resistance and replication
origin elements, each of the vectors has a cloning site
flanked by two different, predefined oligonucleotide
"tags" (i.e., forty total tags are used with twenty
vectors). These tags are in turn flanked by sites
recognized by the NotI restriction endonuclease (which
cuts only at infrequent sites). The vectors differ from
each other only by their tag sequences, which are
originally selected from a random collection of chemically
synthesized oligonucleotides.

In accordance with the method, DNA is sonicated to
produce fragments of 900-1500 base pairs. Such DNA is
rendered ligatable through treatment with Bal 31
exonuclease and then with T4 DNA polymerase and all four
deoxynucleotide triphosphates. The DNA fragments are then
ligated separately into each of the vectors and the
ligation mixtures are used to transform E. coli cells.
This procedure thus results in the formation of 20 gene
libraries, which can then be amplified by conventional
means. After amplification, the vectors are treated with
NotI in order to excise the cloned DNA which is to be
sequenced. Such excision produces DNA molecules having
termini which are appropriate for the required subsequent
chemical sequencing. The cloned DNA from each of the
libraries is then mixed together to form a single pool
containing each of the twenty members of the library.

The sequence of the cloned DNA of the libraries is
determined using the Maxam-Gilbert method. The pool of 20
libraries is treated as a single unit in accordance with
that method. The reaction products are then applied to a
sequencing gel, and the oligonucleotides in the DNA sample
are separated using gel electrophoresis. The DNA
patterns, thus obtained, are then electro-transferred from
the gels onto nylon membranes and crosslinked to the
membranes using UV light.

Since each lane of the gel contains the reaction
products of the sequencing of 20 different DNA molecules,
each lane contains 20 overlaid ladders of sequence
information. Because the NotI fragment of the cloned DNA
contains the tag region of the vector, each oligonucleo-
tide of a particular sequence ladder contains the tag
region. A particular sequence ladder may thus be
visualized by hybridizing a labelled probe for a
particular tag to the DNA bound to the membrane. By
washing the membrane with sodium dodecyl sulfate and EDTA,
it is possible to remove the hybridized probe from the
membranes. This step thus prepares the membranes to be
used to analyze a second sequence ladder by hybridizing a
labelled probe for a second particular tag to the DNA
bound to the membrane. In this manner, the sequence
information of twenty vectors can be ascertained from a
sequencing gel.

Thus, whereas conventional techniques permitted the
sequencing of 300 bases per electrophoretic analysis, the
multiplex DNA sequencing approach permits one to obtain
sequence information of 6000 bases per analysis.
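The pooling and sequential probing described above can be modelled with a brief sketch. The tag names and rung lengths are hypothetical, and the membrane is represented simply as a list of bands, each recorded with its fragment length and the tag it carries.

```python
def pooled_lane(libraries):
    """Pool the ladders of several tagged libraries into one gel lane.

    `libraries` maps a tag name to the rung lengths of its sequence ladder.
    Each band transferred to the membrane is recorded as (length, tag).
    """
    return sorted(
        (length, tag) for tag, rungs in libraries.items() for length in rungs
    )


def probe(membrane, tag):
    """Hybridize a probe for one tag: only that library's ladder is seen."""
    return [length for length, band_tag in membrane if band_tag == tag]


# Three hypothetical tagged libraries sharing a single lane.
libraries = {"tagA": [2, 5, 7], "tagB": [1, 5, 8], "tagC": [3, 4, 7]}
membrane = pooled_lane(libraries)
ladder_b = probe(membrane, "tagB")

# Throughput figures taken from the text: about 300 bases per ladder, 20 vectors.
bases_per_analysis = 300 * 20
```

Bands of the same length from different libraries overlap on the membrane, yet each probing reveals only the ladder carrying the matching tag, which is why one lane can yield twenty ladders of about 300 bases each.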

A significant advance in restriction endonuclease
fragment analysis was recently disclosed by Garcia, E. et
al. (In: Genome Mapping and Sequencing, Abstracts of
Meeting Proceedings, Cold Spring Harbor, May 2-6, 1990,
page 62). This reference concerns the use of a yeast
artificial chromosome vector to clone large DNA sequences
(such as from the human genome). To allow direct end-
labelling of the vectors, the vectors were constructed to
contain a 34 bp loxP fragment. By incubating this vector
in the presence of Cre and a short labelled oligonucleo-
tide which contained a loxP sequence, it was possible to
label the molecules in vitro. The use of radioisotopic or
biotin labels was disclosed.

Evans et al. have recently described a method which
is potentially applicable to cloning, ordering clones, and
the physical mapping of complex genomes (Evans, G.A. et
al., Proc. Natl. Acad. Sci. (U.S.A.) 86:5030-5034 (1989),
herein incorporated by reference). Unfortunately, Evans
et al. have elected to refer to this method as "multiplex
analysis." The term "multiplex analysis" as used herein
differs significantly from the "multiplex analysis" term
used by Evans et al. Thus, although the Evans et al.
reference uses the same term as that used by the
inventors, it describes a different technique. The use of
the term by the present inventors is consistent with its
use by Church et al., and others in the art.

In brief, the method described by Evans et al. uses
a single cosmid library which is constructed by inserting
random DNA fragments into a site adjacent to T3 or T7
bacteriophage promoters. Because these promoter sequences
flank the cloned DNA (Wahl, G.N. et al., Proc. Natl. Acad.
Sci. (U.S.A.) 84:2160-2164 (1987)), they can be used as
probes to detect clones which have overlapping sequences.

In summary, a method which would minimize the number
of gels needed for the determination of a particular
sequence would, therefore, be highly desirable.
Similarly, a method which would facilitate the
construction of gross and fine restriction maps of a
target molecule would also be highly desirable. Indeed,
for the analysis of very large genomes, such as the human
genome, the development of such methods may be essential.

SUMMARY OF THE INVENTION:

As indicated above, the analysis of a target DNA
molecule often entails fragmenting the molecule, and
analyzing and sequencing the resultant fragments.
Especially for large DNA molecules, this is a difficult
procedure. The present invention relates to an improved
method for constructing gross and fine restriction maps of
a target DNA molecule.

The invention further relates to an improved method
for determining the nucleotide sequence of a target DNA
molecule.

In detail, the invention provides a method for
analyzing a target DNA molecule, which comprises:

(A) forming a recombinant molecule, the recombinant
molecule comprising a probe/primer sequence linked to a
recombinational site (I), wherein the site is linked to
the sequence of the target molecule;

(B) analyzing the target molecule using a nucleic
acid molecule capable of hybridizing to the probe/primer
sequence, or its complement.

The invention also provides the embodiment of the
above method for determining nucleotide sequence of a
target DNA molecule wherein the recombinant molecule is
formed by:

(1) introducing the target DNA molecule into at
least one vector, having a recombinational site (II), to
thereby form a vector-target DNA construct;

(2) incubating the vector-DNA construct in the
presence of a recombinase, and a DNA molecule having the
recombinational site (I) and a probe/primer region;
wherein the incubation is under conditions sufficient to
permit the recombinase to mediate recombination between
the recombinational site (II) of the vector-target DNA
construct and the recombinational site (I) of the DNA
molecule; and

(3) permitting the recombinase to mediate
recombination between the recombinational sites, and to
thereby form the sequencing molecule.
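Steps (1) through (3) can be pictured at the level of sequence strings with the following sketch. It is a simplified model under stated assumptions, not a description of the recombinase mechanism: the vector-target construct is treated as a circle carrying one recombinational site, the second DNA molecule as a linear molecule carrying one site next to its probe/primer region, both sites are assumed to share the same orientation, and the site itself is represented by the placeholder token `<loxP>`. All names and sequences are hypothetical.

```python
LOX = "<loxP>"  # placeholder token standing in for the recombinational site


def recombine(circular_construct, linear_molecule, site=LOX):
    """Integrate a one-site circular construct into a one-site linear molecule.

    Both arguments are strings containing exactly one `site`. The circle is
    opened at its site and inserted into the linear molecule, giving
    left - site - circle body - site - right.
    """
    if circular_construct.count(site) != 1 or linear_molecule.count(site) != 1:
        raise ValueError("each molecule must carry exactly one site")
    left, right = linear_molecule.split(site)
    head, tail = circular_construct.split(site)
    body = tail + head  # the circle read starting just after its site
    return left + site + body + site + right


# Hypothetical circular construct (site (II) next to the target) and a
# hypothetical linear DNA molecule (probe/primer region next to site (I)).
construct = "VECTOR" + LOX + "TARGET"
oligo = "PROBE" + LOX
product = recombine(construct, oligo)
```

In the product of this model, the probe/primer region lies immediately beside a recombinational site which in turn adjoins the target sequence, which is the arrangement recited in step (A) of the general method.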

The invention also provides the embodiments of the
above methods for determining nucleotide sequence of a
target DNA molecule wherein at least two vector-target DNA
constructs are formed, and wherein at least two different
DNA molecules each having a recombinational site (I) and
further having a different probe/primer region are
employed; and wherein the determining of the sequence of
the target molecule is through use of two probes, each
capable of hybridizing to only one of the probe/primer
regions, or its complement.

The invention also provides the embodiment of the
above methods for determining nucleotide sequence of a
target DNA molecule wherein the recombination is site-
specific recombination, and, in particular, wherein in the
site-specific recombination, the recombinase is Cre, and
at least one, and preferably both, of the recombinational
sites (I) or (II) are loxP sites. The invention also
provides the embodiments of the above methods wherein the
recombinational site (I) is a loxP site, or a mutant loxP
site, and wherein the vector contains one wild-type loxP
site and one mutant loxP site.

The invention also provides the embodiment of the
above-described method for analyzing a target DNA
molecule, wherein the analysis comprises ordering
restriction endonuclease recognition sites in a target DNA
molecule, and wherein, in step (B), the analysis comprises

(i) incubating the recombinant molecule in the
presence of a restriction endonuclease under conditions
sufficient to permit the endonuclease to cleave DNA
containing a cleavage site recognized by the endonuclease;
and

(ii) determining the order of any restriction sites
in the target molecule using a nucleic acid molecule
capable of hybridizing to a probe/primer sequence, or its
complement.

The invention also provides the embodiment of the
above method for ordering restriction endonuclease
recognition sites in a target DNA molecule wherein the
recombinant molecule is formed by:

(1) introducing the target DNA molecule into at
least one vector, having a recombinational site (II), to
thereby form a vector-target DNA construct;

(2) incubating the vector-DNA construct in the
presence of a recombinase, and a DNA molecule having the
recombinational site (I) and a probe/primer region;
wherein the incubation is under conditions sufficient to
permit the recombinase to mediate recombination between
the recombinational site (I) of the DNA molecule, and the
recombinational site (II) of the vector-target DNA
construct; and

(3) permitting the recombinase to mediate
recombination between the recombinational sites (I) and
(II), and to thereby form the recombinant molecule.

The invention also provides the embodiment of the
above method for ordering restriction endonuclease
recognition sites in a target DNA molecule wherein at
least two vector-target DNA constructs are formed, and
wherein at least two different DNA molecules each having
a recombinational site (I) and further having a different
probe/primer region are employed; and wherein the ordering
of restriction sites of the target molecule is through use
of two probes, each capable of hybridizing to only one of
the probe/primer regions, or its complement.

The invention also provides the embodiments of the
above methods for ordering restriction endonuclease
recognition sites in a target DNA molecule wherein the
recombination is site-specific recombination, and, in
particular, wherein in the site-specific recombination,
the recombinase is Cre, and at least one, and preferably
both, of the recombinational sites (I) or (II) are loxP
sites. The invention also provides the embodiments of the
above methods for ordering restriction endonuclease
recognition sites in a target DNA molecule wherein the
recombinational site (I) is a loxP site, or a mutant loxP
site, and wherein the vector contains one wild-type loxP
site and one mutant loxP site.

The invention also provides a kit specially adapted
to mediate recombination between a DNA molecule having a
recombinational site (I), and a DNA vector, having a
recombinational site (II), the kit comprising in close
compartmentalization:

1) a first container containing a recombinase
capable of mediating the recombination
between the site (I) of the DNA molecule
and the site (II) of the vector; and
2) a second container containing a DNA
molecule having the recombinational site
(I).

The invention also provides the embodiment of the
above kit wherein the recombinase is Cre, and wherein at
least one, and preferably both, of the recombinational
sites (I) and (II) is a loxP site.

The invention also provides the embodiment of the
above kit wherein the kit additionally contains a third
container containing the DNA vector.

The invention also provides the embodiment of the
above kits wherein the vector contains one loxP site, or
wherein the vector contains one wild-type loxP site and
one mutant loxP site.

The invention also provides a set of nested
oligonucleotides each of which has a first region of
unknown sequence, and a second region of known sequence,
wherein the second region comprises both a recombinational
site, and a probe/primer region.

The invention also provides the embodiments of the
above set of nested oligonucleotides wherein at least one
of the oligonucleotides is hybridized to a probe or to a
primer.

The invention also provides the embodiment of the
above set of nested oligonucleotides wherein the
recombinational site is a loxP site.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 shows the recombination of a circular DNA
molecule having two loxP sites in direct orientation.

Figure 2 shows the recombination of a circular DNA
molecule having a single loxP site with a linear
loxP-containing DNA molecule to produce a linear molecule.

Figure 3 shows the recombination of the loxP sites in
an inverted repeat orientation.

Figure 4 illustrates the exchange of the DNA that
results from recombination between two linear molecules.

Figure 5 shows the use of Cre and loxP sites.

Figure 6 shows the structure of a vector in which the
recombinational site is illustrated as a loxP site; the
orientation of the loxP is always left to right relative
to the other elements shown.

Figure 7 shows an alternative vector containing two
non-corresponding recombinational sites.

Figure 8 shows the cloning of a target molecule into
a vector.

Figure 9 shows the cloning of a target molecule into
an alternative vector containing two non-corresponding
recombinational sites.

Figure 10 shows the structures of loxP-containing
oligonucleotides having one or more probe/primer regions,
wherein the roman numerals indicate the presence of the
probe/primer regions which may be different or the same in
sequence.

Figure 11 shows the linear molecule that is produced
by incubating the molecules of Figures 8 and 9 together,
in the presence of Cre.

Figure 12 shows the set of nested fragments that is
obtained when the molecule of Figure 10 is subjected to
partial restriction endonuclease digestion with a
restriction enzyme, and analyzed by electrophoresis.

Figure 13 shows the visualization of the nested
fragments, and how such visualization facilitates
restriction mapping.

Figure 14 shows the linear molecule that results from
recombination of the vector through the action of a
suitable recombinase.

Figure 15 shows the structures of members of an
unfractionated vector library after recombination with one
of a plurality of linear molecules each of which differs
from the other in the sequence of its probe/primer
sequence.

Figure 16 shows the structures of loxP511-containing
oligonucleotides having one or more probe/primer regions,
wherein the roman numerals indicate the presence of the
probe/primer regions which may be different or the same in
sequence.

Figure 17 shows the structure of a linear molecule
containing more than one adjacent probe/primer regions
separated by a recombinational site.

Figures 18A and 18B show the linear molecule that
would be produced using the loxP / Cre system, after
Cre-mediated recombination with a plasmid of the type shown in
Figure 8 and with an Oligonucleotide I (Primer #n - loxP)
or an Oligonucleotide II (Primer - Probe #n - loxP).

Figure 19 shows the structure of a cosmid containing
recombinational sites.

Figure 20 shows a preferred, single-stranded
loxP-containing oligonucleotide that possesses a sequence which
causes it to snap back upon itself.

Figure 21 shows the result of recombination between
the molecules of Figure 18 and Figure 19.

Figure 22 shows the structures of the molecules of an
array of molecules obtained upon partial restriction
endonuclease digestion of the molecules of Figure 21.

Figure 23 shows the structure of a class of molecules
having two loxP sites in a direct repeat that result from
the incubation of the mixture of oligonucleotides of
Figure 22 with a DNA ligase in the presence of a second
loxP-containing oligonucleotide, such as shown in Figure
20.

Figure 24 shows the structure of a class of molecules
having two loxP sites in an inverted repeat that result
from the incubation of the mixture of oligonucleotides of
Figure 22 with a DNA ligase in the presence of a second
loxP-containing oligonucleotide, such as shown in Figure
20.

Figure 25 shows a partially duplex molecule composed
of certain oligonucleotides.

Figure 26 shows a partially duplex molecule composed
of certain oligonucleotides.

DESCRIPTION OF THE PREFERRED EMBODIMENTS:

I. RECOMBINATION

The present invention uses the process of recombination
(Watson, J.D., In: Molecular Biology of the Gene,
4th Ed., W.A. Benjamin, Inc., Menlo Park, CA (1987), which
reference is incorporated herein by reference). Thus, an
understanding of the process of recombination is desirable
in order to fully appreciate the present invention.

Recombination is a well-studied natural process which
results in the scission of two nucleic acid molecules
having identical or substantially similar sequences (i.e.
"homologous"), and the joining of the two molecules such
that one region of each initially present molecule becomes
joined to a region of the other initially present molecule
(Sedivy, J.M., Bio-Technol. 6:1192-1196 (1988), which
reference is incorporated herein by reference). The
recombinational reaction is catalyzed by enzymes, globally
referred to as "recombinases." Such enzymes are naturally
present in both prokaryotic and eukaryotic cells (Smith,
G.R., In: Lambda II, (Hendrix, R. et al., Eds.), Cold
Spring Harbor Press, Cold Spring Harbor, NY, pp. 175-209
(1983), herein incorporated by reference). As discussed
below, several recombinases are commercially available.

Two types of recombinational reactions have been
identified. In the first type of reaction, "general" or
"homologous" recombination, any two homologous sequences
can be recognized by the recombinase (i.e. a "general
recombinase"), and thus act as substrates for the
reaction. In contrast, the second type of recombination,
"site-specific" recombination, employs specialized
recombinases (i.e. "site-specific recombinases") which
can recognize only certain defined sequences. Thus, in
site-specific recombination, only molecules having a
particular sequence may act as substrates for the
reaction. The significance of each type of recombinational
reaction is discussed below.

A. GENERAL RECOMBINATION

General recombination is a process by which a
"region" of DNA can be transferred from one DNA molecule
to another. As used herein, a "region" of DNA is intended
to generally refer to any nucleic acid molecule. The
region may be of any length from a single base to a
substantial fragment of a chromosome.

For general recombination to occur between two DNA
molecules, the molecules must possess a "region of
homology" with respect to one another. Such a region of
homology must be at least two base pairs long. Two DNA
molecules possess such a "region of homology" when one
contains a region whose sequence is so similar to a region
in the second molecule that homologous recombination can
occur. The transfer of a region of DNA may be envisioned
as occurring through a multi-step process.

If either of the two participant molecules is a
circular molecule, then the above recombination event
results in the integration of the circular molecule into
the other participant.

The frequency of recombination between two DNA
molecules may be enhanced by treating the introduced DNA
with agents which stimulate recombination. Examples of
such agents include trimethylpsoralen, UV light, etc.

The best characterized general recombination system
is that of the bacterium E. coli (Smith, G.R., In: Lambda
II, (Hendrix, R. et al., Eds.), Cold Spring Harbor Press,
Cold Spring Harbor, NY, pp. 175-209 (1983)). The E. coli
system involves the protein, RecA, which in the presence
of ATP or another energy source, can catalyze the pairing
79:3398-3402 (1982); Sauer, B.L., U.S. Patent No.
4,959,317, herein incorporated by reference).

The recombination is mediated by a P1-encoded protein
known as "Cre" (Hamilton, D.L., et al., J. Mol. Biol.
178:481-486 (1984), herein incorporated by reference).
The Cre protein mediates recombination between two loxP
sequences (Sternberg, N., et al., Cold Spring Harbor Symp.
Quant. Biol. 45:297-309 (1981)). These sequences may be
present on the same DNA molecule, or they may be present
on different molecules. Cre protein has a molecular
weight of 38,000. The protein has been purified to
homogeneity, and its reaction with the loxP site has been
extensively characterized (Abremski, K., et al., J. Mol.
Biol. 259:1509-1514 (1984), herein incorporated by
reference). The cre gene (which encodes the Cre protein)
has been cloned (Abremski, K., et al., Cell 32:1301-1311
(1983), herein incorporated by reference). Cre protein
can be obtained commercially from New England
Nuclear/Dupont.

The site-specific recombination catalyzed by the
action of Cre protein on two loxP sites is dependent only
upon the presence of the above-described thirty-four base
pair loxP site and Cre. Magnesium ions or spermidine are
needed for efficient recombination. Energy, however, is
not required for this reaction; thus, there is no
requirement for ATP or other similar high energy
molecules. No proteins other than Cre are required in
order to mediate site-specific recombination at loxP sites
(Abremski, K., et al., J. Mol. Biol. 259:1509-1514
(1984)). In vitro, the reaction is highly efficient; Cre
is able to convert up to about 70% of the DNA substrate
into products and it appears to act in a stoichiometric
manner. The extent of reaction reflects an equilibrium
among the various molecules containing loxP sites.

Cre-mediated recombination can occur between loxP
sites which are both present on the same molecule, or
which are present on two different molecules. Because the
internal spacer sequence of the loxP site is asymmetrical,
two loxP sites can exhibit directionality relative to one
another (Hoess, R.H., et al., Proc. Natl. Acad. Sci.
(U.S.A.) 81:1026-1029 (1984)). When two sites on the same
DNA molecule are in a directly repeated orientation (i.e.
→ →), Cre will excise the DNA between the sites
(Abremski, K., et al., Cell 32:1301-1311 (1983)).
However, if the sites are inverted with respect to each
other (i.e. → ←), the DNA between them is not
excised after recombination but is simply inverted. Thus,
a circular DNA molecule having two loxP sites in direct
orientation will recombine to produce two smaller circles,
whereas circular molecules having two loxP sites in an
inverted orientation simply invert the DNA sequences
flanked by the loxP sites (Figure 1).

Two circular molecules each having a single loxP site
will recombine to form a mixture of monomer, dimer,
trimer, etc. circles. Higher concentrations of circles
favor higher n-mers; lower concentrations of circles favor
monomers.

A circular DNA molecule having a single loxP site
will recombine with a linear loxP-containing DNA molecule
to produce a linear molecule (Figure 2).

As indicated above, a linear molecule with direct
repeats of loxP sites interacts to produce a circle
(containing the sequences between the loxP sites), and a
linear molecule. However, if the loxP sites are inverted
repeats, recombination flips the sequence between the loxP
sites back and forth (Figure 3).
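
The dependence of the outcome on site orientation can be sketched in a few lines of Python. This is an illustrative model only, not part of the disclosed method: a linear molecule is written as a list of named segments and loxP sites, the orientation of a site is a sign (+1 or -1), and the names `cre_recombine`, `flip` and the segment labels are hypothetical. A reversed segment is marked with a trailing apostrophe.

```python
from typing import Dict, List, Tuple, Union

Site = Tuple[str, int]        # ("loxP", +1) or ("loxP", -1); the sign is the orientation
Element = Union[str, Site]    # a named DNA segment or a loxP site


def is_site(element: Element) -> bool:
    return isinstance(element, tuple) and element[0] == "loxP"


def flip(segment: str) -> str:
    """Mark a named segment as reversed (trailing ') or restore it."""
    return segment[:-1] if segment.endswith("'") else segment + "'"


def cre_recombine(linear: List[Element]) -> Dict[str, List[Element]]:
    """Model one recombination event between the two loxP sites of a linear molecule."""
    positions = [i for i, e in enumerate(linear) if is_site(e)]
    if len(positions) != 2:
        raise ValueError("this sketch expects exactly two loxP sites")
    i, j = positions
    left, middle, right = linear[:i], linear[i + 1:j], linear[j + 1:]
    if linear[i][1] == linear[j][1]:
        # Direct repeat: the intervening DNA leaves as a circle carrying one site.
        return {"linear": left + [linear[i]] + right, "circle": middle + [linear[j]]}
    # Inverted repeat: the intervening DNA stays in place but is reversed.
    inverted = [flip(s) for s in reversed(middle)]
    return {"linear": left + [linear[i]] + inverted + [linear[j]] + right}


# Hypothetical example molecules.
direct = ["A", ("loxP", 1), "B", ("loxP", 1), "C"]
inverted = ["A", ("loxP", 1), "B", "C", ("loxP", -1), "D"]
excised = cre_recombine(direct)
once = cre_recombine(inverted)["linear"]
twice = cre_recombine(once)["linear"]
```

Under these assumptions the direct-repeat molecule yields a shortened linear molecule plus a circle, while two successive events on the inverted-repeat molecule return it to its starting arrangement, which mirrors the "back and forth" behavior described above.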

When the starting DNA substrate is supercoiled, the
final reaction product is also supercoiled (Abremski, K.,
et al., Cell 32:1301-1311 (1983); Abremski, K., et al.,
J. Biol. Chem. 261:391-396 (1986)). The recombinational
event does not, however, require supercoiling, and works
with equal efficiency on supercoiled or linear molecules
(Abremski, K., et al., Cell 32:1301-1311 (1983); Abremski,
K., et al., J. Biol. Chem. 261:391-396 (1986)).
loxP sites present on the 150 kb genome of the pseudorabies
virus (Sauer, B., et al., Gene 70:331-341 (1988),
herein incorporated by reference).

It has been found that certain E. coli enzymes
inhibit efficient circularization of linear molecules
which contain loxP sites at their termini. Hence,
enhanced circularization efficiency can be obtained
through the use of E. coli mutants which lack exonuclease
V activity (Sauer, B., et al., Gene 70:331-341 (1988)).

Cre has been able to mediate loxP-specific
recombination in mammalian cells (Sauer, B., et al., Proc.
Natl. Acad. Sci. (U.S.A.) 85:5166-5170 (1988); Sauer, B.,
et al., Nucleic Acids Res. 17:147-161 (1989), both
references herein incorporated by reference). Similarly,
the recombination system has been capable of catalyzing
recombination in plant cells (Dale, E.G., et al., Gene
91:79-85 (1990)).

2. The Flp Recombination System 

Yeast express a recombinase known as "Flp" which
catalyzes the site-specific inversion of a region of the
yeast 2-μ circle plasmid (Schwartz, C.J. et al., J. Molec.
Biol. 205:647-658 (1989); Parsons, R.L. et al., J. Biol.
Chem. 265:4527-4533 (1990); Golic, K.G. et al., Cell
59:499-509 (1989); Amin, A.A. et al., J. Molec. Biol.
214:55-72 (1990)). The flp gene has been cloned, and the
site ("FRT") which is recognized by the Flp recombinase
has been determined (Vetter, D., et al., Proc. Natl. Acad.
Sci. (U.S.A.) 80:7284-7288 (1983), herein incorporated by
reference). The organization of the FRT sequence is
similar to that of the loxP sequence recognized by Cre;
however, the sequence contains a nearly perfect inverted
repeat and a direct repeat. Flp-mediated recombination
optimally occurs in vitro at a pH of between 6.6 and 8.0.
A divalent cation such as Mg2+ or spermidine is required.
The Flp protein can recombine 2-μ derivatives having
directly repeated FRT sites to produce two circular
plasmids. It does not require either host factors or
supercoiled substrates (Vetter, D., et al., Proc. Natl.
Acad. Sci. (U.S.A.) 80:7284-7288 (1983)).

3. The Gin/Fis Recombination System 

The E. coli bacteriophage Mu has been found to encode
a protein (Gin) which can mediate recombination at
specific sites in the Mu genome (Mertens, G. et al., EMBO
J. 3:2415-2421 (1984); Mertens, G. et al., J. Biol. Chem.
261:15668-15672 (1986)). The reaction causes a
site-specific inversion of the Mu G segment and results in an
altered host range. Recombination occurs at 34 base pair
long sites, which must be arranged as inverted repeats on
supercoiled DNA.

In order for inversion to occur, the phage encoded
gin gene must be expressed. The product of this gene
(Gin) binds to sites in each of the inverted repeats in a
cooperative manner, induces a two base pair staggered
nick, and forms a covalent linkage with DNA at the 5' end
of each nick. In the presence of Gin alone, DNA inversion
occurs with low frequency both in vitro and in vivo. In
order to stimulate inversion, an E. coli host factor,
known as "Fis" is typically required, unless a
Fis-independent Gin mutant (i.e. a protein capable of
catalyzing recombination in a site-specific manner without
host factor) is employed. Such a mutant is disclosed by
Klippel, A. et al. (EMBO J. 7:3983-3989 (1988)).

4. The λ Recombination System

The site-specific recombination system of the E. coli
bacteriophage λ has been well characterized (Weisberg, R.
et al., In: Lambda II, (Hendrix, R. et al., Eds.), Cold
Spring Harbor Press, Cold Spring Harbor, NY, pp. 211-250
(1983), herein incorporated by reference). Bacteriophage
λ uses this recombinational system in order to integrate
its genome into that of its host, the bacterium E. coli.
The system is also employed to excise the bacteriophage
from the host genome in preparation for the virus' lytic
growth.

The recombination system is composed of four
proteins: Int and Xis, which are encoded by the virus, and
two host factors encoded by E. coli. These proteins
catalyze site-specific recombination between "Att" sites.

The λ Int protein (together with the E. coli host
integration factors) will catalyze recombination between
"AttP" and "AttB" sites. If the AttP sequence is present
on a circular molecule, and the AttB site is present on a
linear molecule, the result of the recombination is the
disruption of both Att sites, and the insertion of the
entire AttP-containing molecule into the AttB site of the
second molecule. The newly formed linear molecule will
contain an AttL and an AttR site at the termini of the
inserted molecule.

Even in the presence of host factors, the λ Int
enzyme, by itself, is unable to catalyze the excision of
the inserted molecule. Thus, the reaction is
unidirectional. If a second λ protein, the λ Xis protein,
is added to the reaction, the reverse reaction can
proceed, and a site-specific recombinational event will
occur between the AttR and AttL sites to regenerate the
initial molecules.

The nucleotide sequences encoding both the Int and Xis
proteins are known, and both proteins have been purified
to homogeneity. Both the integration and the excision
reaction can be conducted in vitro. The nucleotide
sequences of the four Att sites have also been determined
(Weisberg, R. et al., In: Lambda II, (Hendrix, R. et al.,
Eds.), Cold Spring Harbor Press, Cold Spring Harbor, NY,
pp. 211-250 (1983), which reference has been herein
incorporated by reference).
5. Other Site-Specific Recombination Systems

Any of a large number of additional site-specific
recombination systems can be used in accordance with the
methods of the present invention.

Such systems are discussed by Echols, H. (J. Biol.
Chem. 265:14697-14700 (1990)), de Villartay, J.P. (Nature
335:170-174 (1988)), Craig, N.L. (Ann. Rev. Genet.
22:77-105 (1988)), Poyart-Salmeron, C. et al. (EMBO J.
8:2425-2433 (1989)), Hunger-Bertling, K. et al. (Molec. Cell.
Biochem. 92:107-116 (1990)), and Cregg, J.M. (Molec. Gen.
Genet. 212:320-323 (1989)), all herein incorporated by
reference. Examples of preferred additional recombination
systems include: the TnpI and the β-lactamase transposon
systems (Levesque, R.C., J. Bacteriol. 172:3745-3757
(1990)); the Tn3 resolvase system (Flanagan, P.M. et al.,
J. Molec. Biol. 206:295-304 (1989); Stark, W.M. et al.,
Cell 58:779-790 (1989)); the yeast recombinase systems
(Matsuzaki, H. et al., J. Bacteriol. 172:610-618 (1990));
the B. subtilis SpoIVC recombinase system (Sato, T. et
al., J. Bacteriol. 172:1092-1098 (1990)); the Hin
recombinase system (Glasgow, A.C. et al., J. Biol. Chem.
264:10072-10082 (1989)); the immunoglobulin recombinase
systems (Malynn, B.A. et al., Cell 54:453-460 (1988)); the
Cin recombinase system (Hafter, P. et al., EMBO J.
7:3991-3996 (1988); Hubner, P. et al., J. Molec. Biol.
205:493-500 (1989)); the Pin recombinase system (Plasterk, R.H.A.
et al., Cold Spring Harbor Symp. Quant. Biol. 49:295-300
(1984)); all of the above references are herein
incorporated by reference.

II. RECOMBINATION-FACILITATED SEQUENCE ANALYSIS

As indicated above, the multiplex sequencing method
of G.M. Church et al. (Science 240:185-188 (1988))
requires the construction of a large number of vector
libraries. In contrast, the present invention achieves
the goal of multiplex sequencing without the need to
construct multiple gene libraries. In accordance with the
present invention, DNA or cDNA from any desired source is
obtained, and cloned into a cloning site of any of the
well-known prokaryotic, eukaryotic, or shuttle
vectors, modified to contain a recombinational site.
Examples of suitable vectors are provided by Maniatis, T.,
et al. (In: Molecular Cloning, A Laboratory Manual, Cold
Spring Harbor Press, Cold Spring Harbor, NY (1982)). The
sequence information is then obtained through the use of
a novel method.

Of particular importance to the present invention is
the fact that recombination between two linear molecules
results in the exchange of their DNA (Figure 4).

Thus, if in Figure 4, the sequence 1-7 represented a
target molecule of unknown sequence, and the sequence H I
contained a detectable marker, the result of the
recombination would be to link the detectable marker to
the target molecule.
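
The exchange between two linear molecules can be illustrated with a minimal Python sketch. It assumes that each molecule carries exactly one recombinational site, that both sites lie in the same orientation, and that the arms to the right of the sites are swapped; the function name `exchange_arms` and the segment labels are hypothetical and are not the labels of Figure 4.

```python
from typing import List, Tuple


def exchange_arms(mol_a: List[str], mol_b: List[str],
                  site: str = "loxP") -> Tuple[List[str], List[str]]:
    """Swap the arms that follow the single recombinational site of each molecule."""
    i, j = mol_a.index(site), mol_b.index(site)
    product_1 = mol_a[:i + 1] + mol_b[j + 1:]
    product_2 = mol_b[:j + 1] + mol_a[i + 1:]
    return product_1, product_2


# Hypothetical molecules: one carries a detectable marker, the other the target.
marker_molecule = ["marker", "loxP", "spacer"]
target_molecule = ["vector_arm", "loxP", "target"]
linked, by_product = exchange_arms(marker_molecule, target_molecule)
```

In this sketch the first product joins the marker to the target across the recombinational site, which is the linkage the passage above relies on; the second product carries the two arms that were displaced.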

Any of the above-described recombinases and their
corresponding recombinational sites may be employed. As
used herein, a "recombinational site" is a region of a
DNA molecule of a sequence and size sufficient to permit
it to function as a substrate in a recombinational
reaction when provided with a suitable recombinase, and a
second DNA molecule having a suitable recombinational
site. Where the recombinase is a general recombinase, the
recombinational site can be of any size or sequence. More
preferably, however, the recombinational sites are
selected so as to be capable of serving as a substrate in
a recombinational reaction catalyzed by a site-specific
recombinase, preferably Cre. For example, where the
recombinational sites are loxP sites, the recombinase
would be Cre; where the recombinational sites are attP and
attB sites, the recombinase would be Int. Any other
combination of recombinase and recombinational site may be
Once the ends of the target molecule have been so
prepared, one of the above-described vector molecules is
cleaved with a restriction endonuclease capable of
cleaving the vector within the cloning region, and forming
termini which are capable of ligating with the target
molecule.

The target molecule is then introduced into the
vector, and the vector is recircularized through the
action of a DNA ligase. Procedures for accomplishing
these steps are disclosed by Maniatis, T., et al. (In:
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor
Press, Cold Spring Harbor, NY (1982)).

The use of a vector having two different recombinational
sites facilitates the analysis of restriction sites
at both ends of the target molecule. For the purposes of
illustrating the invention, however, a vector containing
a single recombinational site is depicted. The insertion
of the target sequence is shown in Figure 8 and Figure 9.
This vector is then incubated in the presence of Cre and
a loxP-containing oligonucleotide having one or more
probe/primer regions (Figure 10). The resulting recombination
creates a linear molecule, shown in Figure 11.

When such a molecule is subjected to partial
restriction endonuclease digestion with a restriction
enzyme, and analyzed by electrophoresis, a set of nested
fragments containing the probe/primer region is obtained
(Figure 12).

Significantly, since these nested fragments contain
the probe/primer region of the original molecule, a probe
having a sequence substantially complementary to that of
the probe/primer region will be able to hybridize to the
fragments. By labelling such a probe, it is thus possible
to visualize the nested fragments which hybridize to the
probe/primer region. Moreover, because the molecules
share one end in common, the position of the restriction
sites can be readily determined by measuring the sizes of
the "bands," as shown in Figure 13.
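
The reasoning of the preceding paragraphs, namely that probe-positive bands share one end and therefore report site positions directly, can be checked with a short Python sketch. It is an idealized model under stated assumptions: the probe/primer region sits at position 0 of the linear molecule, partial digestion yields every fragment bounded by two cut points or molecule ends, and band sizes are measured exactly. The function names, the molecule length and the site positions are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple


def partial_digest(length: int, sites: List[int]) -> List[Tuple[int, int]]:
    """All fragments (start, end) that a partial digest can produce."""
    cuts = [0] + sorted(sites) + [length]
    return list(combinations(cuts, 2))


def probe_positive_bands(fragments: List[Tuple[int, int]]) -> List[int]:
    """Sizes of the fragments that retain the probe/primer region at position 0."""
    return sorted(end - start for start, end in fragments if start == 0)


def infer_sites(band_sizes: List[int], length: int) -> List[int]:
    """Each band shorter than the full molecule marks one restriction site."""
    return [size for size in sorted(band_sizes) if size != length]


# Hypothetical 1000 bp molecule with three sites for one enzyme.
fragments = partial_digest(1000, [150, 400, 720])
bands = probe_positive_bands(fragments)
site_map = infer_sites(bands, 1000)
```

Of the ten fragments in this example only four hybridize to the probe, and their sizes (other than the full-length band) equal the distances of the sites from the common end, so the map is read straight off the gel.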
It is possible to prepare multiple sets of nested
fragments using different restriction endonucleases, and
loxP-containing oligonucleotides having different
probe/primer regions. Each set of nested fragments can be
visualized by incubating the total set of fragments with
a probe substantially complementary to the probe/primer
region of the respective set of nested fragments. Thus,
through the use of multiple sets of different probes (each
capable of hybridizing to a different probe/primer
region), the present invention permits one to sequentially
analyze all of these nested sets of fragments. Indeed, if
different labels are employed on different probes, it is
possible to simultaneously analyze different sets of
nested fragments.

By conducting the analysis using sets of loxP-containing
oligonucleotides which contain two probe/primer
regions, one common to all of the oligonucleotides, and
one that is varied to permit the individual visualization
of a single set of nested fragments, it is possible to
visualize all of the sets of nested fragments.

Since such analyses may be conducted from a single
gel, the present invention greatly facilitates the process
of restriction mapping.

EXAMPLE 2 

DNA SEQUENCE ANALYSIS

In this aspect of the present invention, the sequence
of a cloned region can be determined in a multiplex
analysis.

The method utilizes two types of DNA molecules. The
first molecule is a cloning vector which contains a loxP
site. Any of the well-known prokaryotic, eukaryotic, or
shuttle vectors may be modified to permit their
use in the present invention. The vector shall contain at
least one recombinational site, preferably loxP, which
precedes and is adjacent to and, most preferably,



immediately adjacent to a cloning region. The structure
of the vector is as shown in Figure 6 or Figure 7.

In accordance with the sequencing method of the
present invention, a DNA molecule whose sequence is to be
determined (i.e. a "target" sequence) is obtained from a
suitable source. Preferably, such target DNA has been
isolated using a restriction endonuclease which is also
capable of cleaving at a site within the cloning region of
the above-described vector. Alternatively, conventional
methods can be used to adapt the ends of the target such
that they are now capable of being ligated into a
restriction site of the cloning region. This can, for
example, be accomplished with any target by treating
overhanging ends to produce a blunt ended target molecule.

Once the ends of the target molecule have been so
prepared, one of the above-described vector molecules is
cleaved with a restriction endonuclease capable of
cleaving the vector within the cloning region, and forming
termini which are capable of ligating with the target
molecule.

The target molecule is then introduced into the
vector, and the vector is recircularized through the
action of a DNA ligase. Procedures for accomplishing
these steps are disclosed by Maniatis, T., et al. (In:
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor
Press, Cold Spring Harbor, NY (1982)). The construction
is as shown in Figure 8 and Figure 9.

Once the target DNA has been inserted into any of the
above-described vectors, it can then be amplified, by
propagating the vector in a suitable host. Individual
members of the library (either as transformed cells, or
isolated DNA) can then be isolated.

In order to accomplish multiplex sequencing of the
vector, one permits the vector to undergo site-specific
recombination with a linear DNA molecule having a
recombinational site, and at least one probe/primer
region, located near the recombinational site (Figure 10).
Preferably, the linear molecule will have probe/primer
regions on both sides of the recombinational site.

In the presence of a suitable recombinase (such as
Cre), the vector and the linear molecule recombine to form
a linear molecule as shown in Figure 14.

If the sequencing reaction is done using the
Maxam-Gilbert method, then a probe capable of hybridizing to
probe/primer I will identify one set of nested sequencing
reaction products.

Alternatively, the target can be sequenced using the
Sanger method by employing a primer having a 3'-OH terminus
which is capable of hybridizing to the probe/primer
sequence of the fragment. Extension of this primer
creates a set of nested sequencing reaction products.

Significantly, a probe/primer is capable of
identifying only that nest of reaction products which has
a probe/primer sequence to which it can hybridize. Thus,
the use of a probe capable of hybridizing to a different
probe/primer sequence will identify a different set of
nested sequencing reaction products.

This feature of the present invention permits a
multiplex analysis to be performed. To accomplish this,
members of the unfractionated vector library are
separately permitted to recombine with one of a plurality
of linear molecules each of which differs from the other
in the sequence of its probe/primer sequence. The result
of such recombination may be depicted as shown in Figure
15.

Since a probe/primer capable of hybridizing to one
probe/primer sequence (for example probe/primer sequence
I in Figure 15) will identify only that set of nested
sequence reaction products which contains the probe/primer
sequence, all of the sequence reactions may be combined
and analyzed on the same sequencing gel, by sequentially
hybridizing with a different probe/primer sequence. Thus,
where probe/primers capable of hybridizing to probe/primer
sequences I, II, III, and IV are used, a single sequencing
gel can be used to determine the sequence of target
molecules A, B, C and D.
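
The way sequential probing separates pooled reactions can be sketched in Python. This is a simplified illustration rather than a description of the actual chemistry: each reaction product is reduced to a tuple of its probe/primer tag, its length and its terminal base, the four chain-termination lanes are collapsed into one list, and the tags, the target sequences and the function names are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Product = Tuple[str, int, str]   # (probe/primer tag, product length, terminal base)


def sequencing_products(tag: str, target: str) -> List[Product]:
    """Nested products for one target: one product ending at each position."""
    return [(tag, n, target[n - 1]) for n in range(1, len(target) + 1)]


def read_with_probe(pooled: List[Product], tag: str) -> str:
    """Visualize only the ladder carrying `tag` and read it from short to long."""
    ladder = sorted((length, base) for t, length, base in pooled if t == tag)
    return "".join(base for _, base in ladder)


# Hypothetical targets A-D, each recombined with a different probe/primer sequence.
targets: Dict[str, str] = {"I": "GATTACA", "II": "CCGTAGG", "III": "TTAGC", "IV": "ACGTACGT"}
pooled: List[Product] = []
for tag, sequence in targets.items():
    pooled.extend(sequencing_products(tag, sequence))
random.Random(0).shuffle(pooled)   # the pooled products are run together on one gel

reads = {tag: read_with_probe(pooled, tag) for tag in targets}
```

Because each probe selects only the products that carry its own tag, every target is recovered from the same mixed pool, which is the sense in which one gel serves several templates.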

Use of a vector having a second recombinational site
(such as loxP511) adjacent to the second end of the
inserted target molecule, in conjunction with a linear
molecule (such as shown in Figure 16), permits one to
identify a set of nested sequencing reaction products of
the other strand of the target molecule. Thus, such a
probe would permit the sequencing of the second strand of
a DNA molecule (or equivalently, would yield sequence
information relevant to the 3' end of the sequence of
depicted target molecule A).

Significantly, the invention thus permits the
sequencing of both strands of a DNA molecule on the same
sequencing gel.

As indicated above, in one embodiment, the linear
molecule will contain more than one adjacent probe/primer
region separated by a recombinational site (Figure 17).
If one employs a set of such linear molecules in which
probe/primer Z or W is kept invariant, whereas the
sequence of probe/primer sequence I or II is varied, then
one has the capacity to perform either a multiplex
sequence analysis (using probe/primers capable of
hybridizing to probe/primer sequence I or
II or their variants) or a non-multiplex sequence analysis
(using probe/primers capable of hybridizing to
probe/primer sequence Z or W).

EXAMPLE 3 

DETAILED DESCRIPTION OF MULTIPLEX SEQUENCE ANALYSIS

To perform multiplex sequence analysis, a series of
oligonucleotides is constructed. These oligonucleotides
will have a probe/primer region, which may have either of
two general structures (depicted using loxP as the
recombinational site): Oligonucleotide I (Primer #n - loxP)
or Oligonucleotide II (Primer - Probe #n - loxP), where
n indicates the number of different oligonucleotides in
the series.

The above-described vectors are incubated in the
presence of a recombinase and each of the oligonucleotides
of the series, in separate reactions. To illustrate this
aspect of the invention using the loxP/Cre system:
after Cre-mediated recombination of a plasmid of the
type shown in Figure 8 with an Oligonucleotide I
(Primer #n - loxP), the linear molecule shown in Figure
18A would be produced.

After Cre-mediated recombination of a plasmid of
the type shown in Figure 8 with an Oligonucleotide II
(Primer - Probe #n - loxP), the linear molecule shown in
Figure 18B would be produced.

As will be appreciated, either of such molecules can
be employed to determine the sequence of the target
molecule using either the Sanger or Maxam-Gilbert methods.
Where oligonucleotides of class I are employed, n
different primers would be added to each sequencing
reaction, and the probes to detect the sequencing products
would be the complements of the primers. Since such
different probes are being used, it is possible to analyze
all n sequence reactions using a single sequencing gel,
through a multiplex sequence analysis. Where
oligonucleotides of
class II are employed, a single primer would be used in
the sequencing reactions, and different probes (1, 2,
etc.) would be used. This latter embodiment is preferred,
except that target sequence would not be reached until
about 77 nucleotides from the 5' end of the primer (i.e.
20 nucleotides of the primer, 20 nucleotides of the probe,
34 nucleotides of the loxP site, and 3 nucleotides from
the remainder of the cloning site (e.g. SmaI)). When one
wishes to eliminate the need to sequence 20 of these
nucleotides, one would include a deoxyuracil (dU) toward
the 3' end of the primer, and treat with the enzyme UDG
(Uracil DNA Glycosylase) just before running the
sequencing gel. This treatment renders the sites abasic,
but does not cleave the phosphodiester backbone of the DNA
molecule. Cleavage may be accomplished by heating the
reaction, or by incubating it in the presence of an enzyme
(such as endonuclease IV of E. coli) capable of
specifically cleaving nucleic acid molecules at abasic
sites. Note that after Cre-loxP recombination, the
priming sites would be completely single stranded, even
without denaturation, since the recombination
oligonucleotide would be single stranded in the primer
domain.
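
The nucleotide counts quoted above can be checked with a short
calculation. This is a sketch under stated assumptions: the
segment lengths for class II are those given in the text, while
the class I layout (primer, loxP site, cloning-site remainder)
is an assumed decomposition that happens to agree with the 57
nucleotides listed in Table 1. The function name
`offset_to_target` and the `trim_primer` flag are hypothetical,
and the 20 nucleotides removed by dU/UDG treatment are
interpreted here as the primer portion.

```python
# Segment lengths stated in the text (nucleotides).
PRIMER = 20
PROBE = 20
LOXP = 34
CLONING_SITE_REMAINDER = 3  # e.g. the remainder of a SmaI site


def offset_to_target(oligo_class, trim_primer=False):
    """Nucleotides read before target sequence is reached."""
    if oligo_class == "I":
        # Assumed layout: Primer #n - loxP - cloning-site remainder.
        total = PRIMER + LOXP + CLONING_SITE_REMAINDER
    elif oligo_class == "II":
        # Layout from the text: Primer - Probe #n - loxP - remainder.
        total = PRIMER + PROBE + LOXP + CLONING_SITE_REMAINDER
    else:
        raise ValueError("oligo_class must be 'I' or 'II'")
    if trim_primer:
        # dU/UDG treatment followed by cleavage removes the primer portion.
        total -= PRIMER
    return total


print(offset_to_target("I"))                     # 57
print(offset_to_target("II"))                    # 77
print(offset_to_target("II", trim_primer=True))  # 57
```

Under these assumptions, trimming the primer from a class II
product brings its offset down to the class I figure.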

The above method permits one to sequence from one 
side of a target molecule. The use of a vector containing 
two recombinational sites permits one to sequence from 
both sides of the molecule. 

Table 1 compares the ability of the vectors and 
methods of the present invention, with those of the 
multiplex method, to facilitate multiplex sequence 
analysis of ten sequencing reactions with a target DNA 
molecule. 



WO 92/22650    PCT/US92/04923



TABLE 1

                                          Cre/loxP System
                                          Oligonucleotide    Multiplex
ATTRIBUTE BEING COMPARED                    I        II      Method
--------------------------------------------------------------------
Number of Vectors Required                  1        1       5
Number of Libraries Required                1        1       5
Number of Recombination Oligonucleotides    10       10      0
Number of Primers Required                  10       1-2     2
Approximate Number of Nucleotides from      57       77      43
  5' End of Primer to Target
Number of Probe Oligonucleotides Required   10       10      10
Recombinase Required (Cre)                  Yes      Yes     No



The present invention includes articles of
manufacture, such as "kits." In one embodiment, such
kits will, typically, be specially adapted to contain in
close compartmentalization a first container which
contains a DNA vector, which has at least one
recombinational site (I); a second container which
contains at least one probe/primer DNA molecule having a
recombinational site (II), and a probe/primer region; and
a recombinase capable of mediating recombination between
site (I) of the DNA vector and site (II) of the
probe/primer DNA molecule. The kit may additionally
contain multiple probe/primer DNA molecules, which may be
used to facilitate the multiplex sequence analysis of DNA
in accordance with the methods of the invention. The kit
may additionally contain instructional brochures, and the
like. It may also contain reagents sufficient to
accomplish DNA sequencing.

In a second embodiment, such kits will, typically, be
specially adapted to contain in close compartmentalization
a first container which contains a DNA molecule (such
as a linear oligonucleotide or a vector) which has at
least one recombinational site (I); a second container
which contains at least one DNA molecule having a
recombinational site (II), and is detectably labelled; and
a recombinase capable of mediating recombination between
the recombinational site (I) of the DNA molecule and site
(II) of the labelled oligonucleotide. The kit may
additionally contain instructional brochures, and the
like. It may also contain reagents sufficient to
accomplish DNA sequencing.

EXAMPLE 4 
SEQUENCING OF A COSMID MOLECULE

The present invention facilitates the sequencing of
cosmid molecules. In this method, a cosmid is constructed
so as to contain a loxP site (Figure 19). The molecule is
incubated in the presence of Cre and a loxP-containing
oligonucleotide. Preferably, the oligonucleotide is
single-stranded, and will possess a sequence which causes
it to snap back upon itself (Figure 20). As a result of
such incubation, a linear molecule will be produced having
the structure shown in Figure 21. Upon restriction
endonuclease digestion, an array of partial-digestion
products such as those shown in Figure 22 is obtained.

As will be recognized, the effect of the reaction has
been to produce a series of oligonucleotides which contain,
at most, only one loxP site. This mixture of
oligonucleotides is then incubated with a DNA ligase in
the presence of a second loxP-containing oligonucleotide,
which will preferably be single-stranded, and possess a
sequence which causes it to snap back upon itself, such as
shown in Figure 20. As a result of such incubation, three
general classes of molecules will be present in the
reaction:

(I) those with only one loxP site,
(II) those having two loxP sites in a direct repeat
(Figure 23), and
(III) those having two loxP sites in an inverted
repeat (Figure 24).

As will be perceived, only molecules which contain
the target sequences that were initially bound to the loxP
site of the first DNA molecule (i.e. A-B sequences) will
contain two directly repeated loxP sites (i.e. class II
molecules). Such molecules thus contain target DNA rather
than cosmid vector DNA.

The mixture of molecules will then preferably be
separated, as with agarose gel electrophoresis, or other
conventional means, and the different sizes of molecules
eluted or otherwise recovered.
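
The three classes of molecules listed above can be
distinguished computationally from a sequence alone. The
following is a minimal sketch, not part of the disclosed
method: it assumes the commonly cited 34 bp wild-type loxP
sequence (which is not reproduced in this section), and the
helper names `revcomp`, `find_sites`, and `classify` are
hypothetical. Because the 8 bp spacer of loxP is asymmetric,
a site and its reverse complement differ, so orientation can
be read from which form is found.

```python
# Commonly cited 34 bp wild-type loxP site, written in one orientation.
# (Assumption: supplied here for illustration; not quoted from the text.)
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_sites(seq):
    """Return (position, orientation) for each loxP occurrence, by position."""
    sites = []
    for orientation, pattern in (("+", LOXP), ("-", revcomp(LOXP))):
        start = seq.find(pattern)
        while start != -1:
            sites.append((start, orientation))
            start = seq.find(pattern, start + 1)
    return sorted(sites)


def classify(seq):
    """Assign a molecule to class I, II or III as defined in the text."""
    sites = find_sites(seq)
    if len(sites) == 1:
        return "I"
    if len(sites) == 2:
        # Same orientation: direct repeat. Opposite: inverted repeat.
        return "II" if sites[0][1] == sites[1][1] else "III"
    return "other"


insert = "ACGTACGT"  # made-up stand-in for target DNA
print(classify("GG" + LOXP + "CC"))             # I
print(classify(LOXP + insert + LOXP))           # II
print(classify(LOXP + insert + revcomp(LOXP)))  # III
```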

The directly repeated loxP sites present on these
molecules permit one, in a Cre-mediated reaction, to
recombine the cloned DNA between these sites into any of
the loxP-containing vectors discussed above.

Thus, this method permits one to subclone target DNA
from a cosmid into a smaller vector. Significantly, the
cloned DNA is manipulated such that it becomes flanked
by directly repeated loxP sites. Moreover, the method
permits one to obtain and clone a set of nested
oligonucleotide fragments of a desired target molecule.

EXAMPLE 5 

MULTIPLEX RESTRICTION FRAGMENT ANALYSIS

The use of Cre/loxP-mediated site-specific
recombination as a method to facilitate multiplex mapping
was demonstrated by the following procedure. The target
molecules were pLox, a 2.9 kb plasmid with a loxP site
cloned into a polylinker region, and pSPORT-lox, a 4.1 kb
plasmid with a loxP site inserted into its multiple
cloning site (MCS).

The sequences chosen for hybridization probes were
taken from Church et al. (Science 240:185-188 (1988)) and
the hybridization, washing, and probe stripping procedures
disclosed therein were used with minor modification.
Specifically, the hybridization probes used were (in the
Church et al. nomenclature) P01, P02, P03 and P04.

Recombinant molecules to be eventually used as
substrates for multiplex mapping were generated as
follows. A partially duplex molecule composed
of the oligonucleotides shown in Figure 25 was
incubated with pSPORT-lox in the presence of Cre under
conditions sufficient to permit recombination to occur.
Specifically, the reaction contained 1 pmol of plasmid,
4 pmol of oligonucleotides and 5 units of Cre (NJW) in
buffer composed of 50 mM Tris-HCl (pH 7.5), 33 mM NaCl, 5
mM spermidine, 0.5 mg/ml bovine serum albumin (BSA);
incubations were at 37°C for 15 minutes. A separate
reaction containing pLox and a second partially duplex
oligonucleotide of the structure shown in Figure 26 was
also incubated in the presence of Cre such that
recombination took place. After inactivation of the Cre
by heating to 65°C for 10 minutes, portions of these
recombination reactions were either kept separate or mixed
together and then subjected to partial digestion with the
restriction endonuclease HaeIII or HhaI. The products
were resolved on an agarose gel. After electrophoresis,
an overnight alkaline transfer to a charged nylon membrane
(BioDyne-B) was performed (Reed and Mann, Nucleic Acids
Res. 13:7207-7221 (1985)).

Pre-hybridizations and hybridizations were performed
in the buffers of Church et al. (Science 240:185-188
(1988)); however, incubations were carried out at 37°C
rather than 42°C and hybridizations were extended
overnight. Oligonucleotide probes (i.e., P01, P02, P03 and
P04 of Church et al., Science 240:185-188 (1988)) were
labeled with T4 polynucleotide kinase and [γ-32P]ATP. All
washes were performed at room temperature and consisted of
two washes with 6xSSC for 2 minutes each followed by
washing with 2xSSC + 0.1% sodium dodecyl sulfate (SDS),
2.5 minutes total with one change of buffer (20xSSC = 3 M
NaCl, 0.3 M Na citrate, pH 7.0). Membranes were then
subjected to autoradiography to determine the linear map
of the respective restriction sites.
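
The reason an end-specific probe yields a linear map from a
partial digest can be shown with a short model. This is a
sketch under stated assumptions, not the experimental
analysis: the molecule length and site positions are made-up
numbers, the probe is taken to hybridize at position 0 of the
linear molecule, and the function names
`partial_digest_fragments` and `end_probe_ladder` are
hypothetical.

```python
from itertools import combinations


def partial_digest_fragments(length, sites):
    """Every (start, end) fragment that some partial digest can release."""
    cuts = [0] + sorted(sites) + [length]
    return set(combinations(cuts, 2))


def end_probe_ladder(fragments, probe_end=0):
    """Lengths of the fragments that still carry the probe-bearing end."""
    return sorted(end - start for start, end in fragments if start == probe_end)


# Hypothetical 4100 bp linear molecule with three made-up restriction sites.
fragments = partial_digest_fragments(4100, [300, 1200, 2500])
print(len(fragments))               # 10
print(end_probe_ladder(fragments))  # [300, 1200, 2500, 4100]
```

Of the ten possible fragments, only those that begin at the
probed end are detected, and their lengths are the distances
of the sites from that end, plus the uncut molecule.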

Probe was then stripped from the membrane by
incubation in 2 mM Na2EDTA + 0.1% SDS (adjusted to pH 8.3
with Tris base) at 65°C for 10 minutes. Removal of probe
was verified by autoradiography, and the hybridization,
washing and visualization process was repeated with a
different radioactive probe.

This method was shown to be highly specific with no
background from cross-hybridization. It yielded accurate
fine structure maps of both substrates. Significantly,
even within lanes which contained mixtures of the two
targets, each pattern could be detected independently,
sequentially, and with complete specificity.

While the invention has been described in connection
with specific embodiments thereof, it will be understood
that it is capable of further modifications and this
application is intended to cover any variations, uses, or
adaptations of the invention following, in general, the
principles of the invention and including such departures
from the present disclosure as come within known or
customary practice within the art to which the invention
pertains and as may be applied to the essential features
hereinbefore set forth and as follows in the scope of the
appended claims.

1. A method for analyzing a target DNA molecule, which
comprises:

(A) forming a recombinant molecule, said recombinant
molecule comprising a probe/primer sequence linked to a
recombinational site (I), wherein said site is linked to
the sequence of said target molecule; and

(B) analyzing said target molecule using a nucleic acid
molecule capable of hybridizing to said probe/primer
sequence, or its complement.

2. The method of claim 1, wherein said analysis
comprises determining a nucleotide sequence of said target
DNA molecule, wherein, in step (B), said analysis
comprises determining the sequence of the target molecule
using a nucleic acid molecule capable of hybridizing to
said probe/primer sequence, or its complement.

3. The method of claim 2, wherein said recombinant
molecule is formed by:

(1) introducing said target DNA molecule into at
least one vector, having a recombinational site (II), to
thereby form a vector-target DNA construct;

(2) incubating said vector-DNA construct in the
presence of a recombinase, and a DNA molecule having said
recombinational site (I) and said probe/primer region;
wherein said incubation is under conditions sufficient to
permit said recombinase to mediate recombination between
said recombinational site (II) of said vector-target DNA
construct and said recombinational site (I) of said DNA
molecule; and

(3) permitting said recombinase to mediate
recombination between said recombinational sites, and to
thereby form said sequencing molecule.

4. The method of claim 3, wherein at least two
vector-target DNA constructs are formed, and wherein at
least two different DNA molecules each having a
recombinational site (I) and further having a different
probe/primer region are employed; and wherein said
determining of the sequence of the target molecule is
through use of two probes, each capable of hybridizing to
only one of said probe/primer regions, or its complement.

5. The method of claim 3, wherein said recombination
is site-specific recombination.

6. The method of claim 5, wherein in said site-specific
recombination, said recombinase is Cre, and at
least one of said recombinational sites (I) or (II) is a
loxP site.

7. The method of claim 6, wherein said recombinational
site (I) is a loxP site.

8. The method of claim 6, wherein said vector contains
one wild-type loxP site and one mutant loxP site.

9. The method of claim 1, wherein said analysis
comprises ordering restriction endonuclease recognition
sites in a target DNA molecule, and wherein, in step (B),
said analysis comprises

(i) incubating said recombinant molecule in the presence
of a restriction endonuclease under conditions sufficient
to permit said endonuclease to cleave DNA containing a
cleavage site recognized by said endonuclease; and

(ii) determining the order of any restriction sites in
said target molecule using a nucleic acid molecule capable
of hybridizing to said probe/primer sequence, or its
complement.

10. The method of claim 9, wherein said recombinant
molecule is formed by:

(1) introducing said target DNA molecule into at
least one vector, having a recombinational site (II), to
thereby form a vector-target DNA construct;

(2) incubating said vector-DNA construct in the
presence of a recombinase, and a DNA molecule having said
recombinational site (I) and said probe/primer region;
wherein said incubation is under conditions sufficient to
permit said recombinase to mediate recombination between
said recombinational site (I) of said DNA molecule, and
said recombinational site (II) of said vector-target DNA
construct; and

(3) permitting said recombinase to mediate
recombination between said recombinational sites (I) and
(II), and to thereby form said recombinant molecule.

11. The method of claim 10, wherein at least two
vector-target DNA constructs are formed, and wherein at
least two different DNA molecules each having a
recombinational site (I) and further having a different
probe/primer region are employed; and wherein said
ordering of restriction sites of the target molecule is
through use of two probes, each capable of hybridizing to
only one of said probe/primer regions, or its complement.

12. The method of claim 10, wherein said recombination
is site-specific recombination.

13. The method of claim 12, wherein in said site-specific
recombination, said recombinase is Cre, and at
least one of said recombinational sites (I) and (II) is a
loxP site.



14. The method of claim 13, wherein said
recombinational site (I) is a loxP site.

15. The method of claim 13, wherein said vector
contains one loxP site and one mutant loxP site.

16. A kit specially adapted to mediate recombination
between a DNA molecule having a recombinational site (I),
and a DNA vector, having a recombinational site (II), said
kit comprising in close compartmentalization:

1) a first container containing a recombinase
capable of mediating said recombination between
said site (I) of said DNA molecule and said
site (II) of said vector; and

2) a second container containing a DNA molecule
having said recombinational site (I).

17. The kit of claim 16, wherein said recombinase is
Cre, and wherein at least one of said recombinational
sites (I) and (II) is a loxP site.

18. The kit of claim 16, which additionally contains a
third container containing said DNA vector.

19. The kit of claim 18, wherein said vector contains
one loxP site.

20. The kit of claim 18, wherein said vector contains
one wild-type loxP site and one mutant loxP site.

21. A set of nested oligonucleotides each of which has
a first region of unknown sequence, and a second region of
known sequence, wherein said second region comprises both
a recombinational site, and a probe/primer region.

22. The set of oligonucleotides of claim 21, wherein at
least one of said oligonucleotides is hybridized to a
probe.




23. The set of oligonucleotides of claim 21, wherein at
least one of said oligonucleotides is hybridized to a
primer.

24. The set of oligonucleotides of claim 21, wherein
said recombinational site is a loxP site.